Abstract

Background: Diagnosis at an early stage of chronic pancreatitis (CP) is challenging. It has been reported that microRNAs 
(miRNAs) are increasingly found and applied as targets for the diagnosis and treatment of various cancers. However, to
the best of our knowledge, few published papers have described the role of miRNAs in the diagnosis of CP. 

Method: We downloaded gene expression profile data from the Gene Expression Omnibus and identified differentially
expressed genes (DEGs) between CP and normal samples of Harlan mice and Jackson Laboratory mice. Common DEGs 
were filtered out, and the semantic similarities of gene classes were calculated using the GOSemSim software package. 
The gene class with the highest functional consistency was selected, and then the Lists2Networks web-based system 
was used to analyse regulatory relationships between miRNAs and gene classes. The functional enrichment of the gene 
classes was assessed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway annotation 
terms. 

Results: A total of 405 common upregulated DEGs and 7 common downregulated DEGs were extracted from the two 
kinds of mice. Gene cluster D was selected from the common upregulated DEGs because it had the highest semantic 
similarity. miRNA 124a (miR-124a) was found to have a significant regulatory relationship with cluster D, and DEGs such 
as CHSY1 and ABCC4 were found to be regulated by miR-124a. The GO term of response to DNA damage stimulus and
the pathway of Escherichia coli infection were significantly enriched in cluster D. 

Conclusion: DNA damage and E. coli infection might play important roles in CP pathogenesis. In addition, miR-124a
might be a potential target for the diagnosis and treatment of CP. 
Background

Chronic pancreatitis (CP) is characterized by pancreatic 
inflammation and fibrosis, and it arises when pancreatic 
injury is followed by a sustained immune activation in 
which fibrosis dominates [1]. Environmental triggers of 
pancreatic inflammation and disease susceptibility (such 
as alcohol use, smoking, pancreatic duct obstruction and 
drugs) or modifying genes (including PRSS1, SPINK1
and CFTR) act synergistically to cause CP [1,2]. It has 
also been indicated that CP is often an underlying cause 
of pancreatic cancer [3]. Meanwhile, in recent years, re- 
searchers in a growing number of studies have suggested 
that microRNAs (miRNAs) play an important role in the 
diagnosis and prognosis of pancreatic cancers [3-6]. 
miRNAs inhibit the translation of mRNA or induce its
degradation, thereby regulating gene expression [7], and
have been shown to be involved in many disease
processes. Therefore, the identification of miRNA
changes might explain the pathology of CP in another
way and provide a new method for diagnosing CP.

A number of miRNAs that have been studied have a 
role in pancreatic diseases. By comparing pancreatic 
cancer tissue to CP tissue and normal pancreas, Bloom- 
ston and colleagues identified 21 miRNAs with increased 
expression and 4 with decreased expression, which sug- 
gests that the miRNAs likely play an important regulatory 
role in pancreatic cancer [3]. It has also been demonstrated
that the expression of miRNA-196a (miR-196a) is
high in pancreatic ductal adenocarcinoma (PDAC) but
low in CP and normal tissues, whereas miR-217 exhibits
the opposite expression pattern [8]. The ratio of miR-196a
to miR-217 has been found to indicate whether tissue 
samples contain PDAC [9]. More and more miRNAs have 
been found to be related to pancreatic cancers, and CP 
specimens are often used as a second control [3,9]. How- 
ever, few published papers have specifically described the 
relationship between CP and its miRNAs. 

In the present study, we analysed the gene expression 
profile of CP and normal mice to screen for differentially 
expressed genes (DEGs). We identified the related miR- 
NAs, which might provide further insights into the mo- 
lecular mechanisms of CP. Understanding the molecular 
mechanisms of CP might aid in diagnosing and treating 
CP patients. 

Methods 

Data sources 

We downloaded a gene data set [GEO:GSE41418] [10] 
from the Gene Expression Omnibus (GEO) database (http:// 
www.ncbi.nlm.nih.gov/geo/). Gene expression analysis was 
performed on a GeneChip Mouse Genome 430 Plus 2.0 
Array platform (Affymetrix, Santa Clara, CA, USA). The data 
set contains two different kinds of mice: Harlan mice
(C57BL/6NHsd; Harlan Laboratories, Indianapolis, IN,
USA) and Jackson Laboratory mice (C57BL/6J; The
Jackson Laboratory, Bar Harbor, ME, USA). A frequently
used experimental model of CP that recapitulates human
disease is repeated injection of cerulein into mice. The
two common substrains of C57BL/6 mice (C57BL/6J
and C57BL/6NHsd) exhibit different degrees of CP, with
C57BL/6J mice being more susceptible to repetitive
cerulein-induced CP [10]. The data set was generated to
identify genes associated with CP and genes differentially
regulated between the two substrains as candidates for
CP progression. It includes six mice of each type,
comprising three CP samples and three normal samples [10].

Identification of differentially expressed genes 

Expression profile data were normalized with GeneChip 
robust multiarray analysis [11]. Next, we preprocessed the 
data derived from 12 samples for subsequent analysis. We 
annotated expression profiling probes to gene symbols. If 
there were multiple probe sets that corresponded to one 
gene, the expression values of those probe sets were aver- 
aged. Using this method, we obtained an expression data 
set comprising 21,389 genes. Afterward, Significance Ana- 
lysis of Microarrays 4.0 software [12] was used to screen 
the DEGs between the CP samples and normal controls of 
the two kinds of mice, respectively. The overlapping DEGs 
were denoted as common DEGs and were used for further 
analysis. A false discovery rate (FDR) <0.05 was selected as
the threshold for screening DEGs.
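
The preprocessing and overlap steps described above can be sketched as follows. This is a minimal illustration on toy data, not the authors' pipeline: the function names, probe identifiers and gene names are hypothetical, and we assume that a common DEG must change in the same direction in both strains, which the text implies but does not state explicitly.

```python
from collections import defaultdict
from statistics import mean


def average_probes(probe_values, probe_to_gene):
    """Collapse probe-level expression to gene level by averaging.

    probe_values maps a probe ID to its expression values (one per sample);
    probe_to_gene maps a probe ID to a gene symbol. Both are hypothetical inputs.
    """
    grouped = defaultdict(list)
    for probe, values in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # skip probes without a gene annotation
            grouped[gene].append(values)
    # Average sample by sample across all probe sets mapped to the same gene.
    return {gene: [mean(col) for col in zip(*rows)] for gene, rows in grouped.items()}


def common_degs(degs_a, degs_b):
    """Return (up, down) gene sets that share a direction in both strains.

    Assumption: 'common' means the same direction of change in both lists.
    """
    up = {g for g, d in degs_a.items() if d == "up" and degs_b.get(g) == "up"}
    down = {g for g, d in degs_a.items() if d == "down" and degs_b.get(g) == "down"}
    return up, down


# Toy data: two probe sets map to GeneA, one to GeneB.
probe_values = {"p1": [2.0, 4.0], "p2": [4.0, 6.0], "p3": [1.0, 1.0]}
probe_to_gene = {"p1": "GeneA", "p2": "GeneA", "p3": "GeneB"}
gene_expr = average_probes(probe_values, probe_to_gene)

# Toy DEG calls per strain (direction of change relative to normal controls).
harlan = {"GeneA": "up", "GeneB": "down", "GeneC": "up"}
jackson = {"GeneA": "up", "GeneB": "up", "GeneC": "up"}
up, down = common_degs(harlan, jackson)
```

In this toy case GeneB is called in both strains but in opposite directions, so under the stated assumption it is excluded from both common sets.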



Gene cluster analysis of common differentially expressed 
genes 

Gene cluster analysis can be used to divide genes into 
several classes based on certain similarity criteria, such 
as the Pearson correlation coefficient or Euclidean
distance [13,14]. Genes in the same cluster are expected to
have a high degree of homogeneity. In our present study,
we used the self-organizing tree algorithm (SOTA)
method [15], available in a gene expression profile
analysis toolset [16], to perform cluster analysis on the
common DEGs based on the gene expression values.
The Euclidean distance was employed as the clustering 
indicator. Next, we calculated the semantic similarity of 
gene classes using the GOSemSim software package 
[17], and the class of genes with the highest functional 
consistency was selected as the optimal gene cluster for 
further study. 
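
The two quantities used in this step can be illustrated with a short sketch. It is not the SOTA or GOSemSim implementation: the Euclidean distance is the standard definition, while the cluster-level score is assumed here to be the mean of pairwise gene similarities, which the text does not spell out. The similarity values in `toy_sim` are invented for illustration.

```python
import math
from itertools import combinations


def euclidean(x, y):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_pairwise_similarity(genes, sim):
    """Average similarity over all gene pairs in a cluster.

    sim maps an alphabetically sorted gene pair to a similarity in [0, 1].
    Assumption: the cluster score is the plain mean over all pairs.
    """
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:  # a cluster with fewer than two genes has no pairs
        return 0.0
    return sum(sim[p] for p in pairs) / len(pairs)


# Hypothetical pairwise semantic similarities for a three-gene cluster.
toy_sim = {("g1", "g2"): 0.2, ("g1", "g3"): 0.4, ("g2", "g3"): 0.6}
cluster_score = mean_pairwise_similarity(["g1", "g2", "g3"], toy_sim)
profile_distance = euclidean([0.0, 0.0], [3.0, 4.0])
```

A higher mean pairwise similarity indicates that the genes in a cluster share more closely related GO annotations, which is the sense in which a cluster is called functionally consistent here.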

Related microRNAs of optimal gene cluster and GO and 
KEGG pathway analysis 

In organisms, highly coexpressed genes are likely to share 
common regulatory patterns and to participate in the 
same or similar biological processes and pathways [18]. In 
order to study the regulatory mechanisms of the optimal 
gene cluster, we used the Lists2Networks web-based sys- 
tem [19] to analyse the possible relationship between the 
miRNAs and the optimal gene cluster. The functional
enrichment of the target genes of two regulators
(transcription factors and miRNAs) was assessed based on the Gene
Ontology (GO) and Kyoto Encyclopedia of Genes and
Genomes (KEGG) pathway annotation terms. GO and
KEGG signalling pathway analyses were performed using
the GOstats R software package (http://www.r-project.org/),
with which we carried out the standard
hypergeometric test. We also performed GO and KEGG
enrichment analysis on the gene cluster, with P-values
less than 0.05 considered statistically significant.
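
For readers unfamiliar with the hypergeometric test, the following sketch shows the one-sided over-representation P-value it yields. This is a plain-Python illustration rather than the GOstats implementation, and the function and parameter names are our own. In the example, the background size (21,389 genes) and the cluster size (110 genes) come from this paper, whereas the number of background genes annotated to the term (300) and the overlap (3) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """One-sided hypergeometric P-value, P(X >= k).

    N: number of genes in the background
    K: background genes annotated to the term
    n: genes in the cluster being tested
    k: cluster genes annotated to the term
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability of observing k, k+1, ..., upper annotated genes.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical term: 300 annotated genes in the background, 3 found in the cluster.
p_example = hypergeom_enrichment_p(N=21389, K=300, n=110, k=3)
```

A term would be reported as enriched when this P-value falls below the chosen threshold, 0.05 in the present study.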

Results 

Identification of differentially expressed genes 

According to the predetermined FDR threshold <0.05, 
962 DEGs of Harlan mice, including 911 upregulated 
genes and 51 downregulated genes, were screened out. 
In Jackson mice, a total of 1,545 genes were differentially 
expressed, and these DEGs comprised 1,423 upregulated 
genes and 122 downregulated genes. Next, we extracted 
overlapping DEGs in both mice, which consisted of 405 
upregulated genes and 7 downregulated genes (Figure 1). 
The number of upregulated genes was considerably
greater than that of downregulated genes in both
strains, so we speculate that these upregulated genes
might play a major role in CP. The subsequent analyses
were therefore restricted to the upregulated common DEGs.

Figure 1 Common differentially expressed genes of the two mouse breeds studied. The red and blue parts represent, respectively, the
upregulated common differentially expressed genes (DEGs) and downregulated common DEGs. Panel labels: Up Genes, Down Genes.



Gene clustering of upregulated common differentially 
expressed genes 

Using the Euclidean distances as the clustering indicators 
in SOTA, we obtained four clearly separated gene classes 
(Figure 2) of the upregulated common DEGs. Next, we 
calculated the semantic similarity scores of gene classes 
(Table 1). As a result, gene cluster D was found to have 
the highest average semantic similarity score (0.2868) and 
was selected for further analysis. 
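
The selection rule amounts to taking the cluster with the maximum score. A minimal sketch using the scores reported in Table 1 follows; the variable names are ours.

```python
# Semantic similarity scores per cluster, as reported in Table 1.
scores = {"A": 0.2145412, "B": 0.2525834, "C": 0.272545, "D": 0.2867982}

# Pick the cluster whose average semantic similarity is highest.
best_cluster = max(scores, key=scores.get)
best_score = round(scores[best_cluster], 4)
```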

Related microRNAs and functional analysis of the optimal 
gene cluster 

According to the enrichment analysis of Lists2Networks, 
miR-124a was found to have a significant regulatory
relationship with cluster D (Table 2). Genes such as CHSY1
(chondroitin sulphate synthase 1) and ABCC4 (ATP-binding
cassette, subfamily C (CFTR/MRP), member 4) were
enriched and associated with miR-124a. According to
GO and KEGG pathway enrichment on gene cluster D,
we found that the most significant biological process was
response to DNA damage stimulus (Table 3), and PARP3
was one of the significant DEGs enriched in the GO term.
The observed significant pathways were associated with 
the cell cycle and Escherichia coli infection (Table 4). 

Discussion 

In the present study, we screened out 405 common upreg- 
ulated DEGs of the two kinds of mice used, and GOSem- 
Sim was used to calculate the semantic similarity of the 
gene clusters of the DEGs. Cluster D was selected as the 
optimal gene class for further investigation because it
had the highest average semantic similarity. Using
Lists2Networks, we found that cluster D could be
significantly regulated by miR-124a, which might play an
important role in the development of CP.

miR-124a was first identified by cloning studies in 
mice [20]. Studies have shown that miR-124a plays an
important role in the control of cell survival, proliferation,
differentiation and metabolism and that its dysfunction
is a potential cause of disease [21-23]. In addition,
published data have demonstrated that the miR-124a
expression level is increased in the mouse pancreas at the
embryonic stage, indicating an important role in
pancreas development [23]. Therefore, we hypothesized
that miR-124a might play an important pathogenic role in CP.

CHSY1 encodes a member of the chondroitin
N-acetylgalactosaminyltransferase family. The protein possesses dual
glucuronyltransferase and galactosaminyltransferase activity
and plays critical roles in the biosynthesis of chondroitin
sulphate, a glycosaminoglycan involved in many biological
processes, including cell proliferation and morphogenesis
[24-26]. CHSY1 was one of the significant genes in cluster
D and was enriched as a target of miR-124a. Researchers
in a previous study demonstrated that CHSY1
regulates its downstream target CASP1 (caspase 1, also
known as interleukin 1β-converting enzyme), which
can cleave interleukin 1β precursors into mature cytokines
and contribute to inflammation [27]. Notably,
increased expression of CASP1 has been reported to be a
frequent event in CP [28]. Thus, miR-124a might participate
in CP manifestation and development by regulating
expression levels of CHSY1 or CASP1.

ABCC4 is another significant gene regulated by miR-124a.
It is a member of the ATP-binding cassette
transporter superfamily, which has been shown to comprise
key mediators of drug efflux and multidrug resistance
in many types of tumours and inflammatory
diseases [29-31]. A previous study also implicated
ABCC4 as an efflux pump of proinflammatory mediators
such as LTB4 and LTC4, suggesting that ABCC4 may represent a
novel target for anti-inflammatory therapies [32]. Therefore,
miR-124a might regulate the inflammation
of CP by changing the levels of proinflammatory mediators
through ABCC4.

Figure 2 Dendrogram used for clustering analysis of the common upregulated differentially expressed genes. As shown in the diagram,
the genes are divided into four categories (A, B, C and D).

Table 1 Semantic similarity scores of the gene clusters

Cluster      Gene number   Semantic similarity score
Cluster A    51            0.2145412
Cluster B    91            0.2525834
Cluster C    121           0.272545
Cluster D    110           0.2867982


On the basis of the results of GO enrichment analysis 
of gene cluster D, the most significant biological process 
we observed was the response to DNA damage stimulus. 
This suggested that DNA damage might play an import- 
ant role in the pathogenesis of CP. The results of our 
analysis are in line with those of a previous study [33]. 
PARP3 is one significant gene that is enriched in the bio- 
logical process of response to DNA damage stimulus. It 
belongs to the poly(ADP-ribose) polymerase (PARP) 
family [34]. PARP3 catalyses the reaction of ADP ribosy- 
lation, a key posttranslational modification of proteins in- 
volved in different signalling pathways from DNA damage 
to energy metabolism and organismal memory [35]. In 
addition, recent studies have clearly demonstrated the role 
of PARP activation in various forms of local inflammation 
[36-38]. Information about the role of PARP3 in CP is
sparse; however, it has been shown that other members of
the PARP family, such as PARP1, coactivate the transcription
factor nuclear factor κB (NF-κB) and are required for
NF-κB-mediated inflammatory responses [39]. CP is
characterized by pancreatic inflammation; thus, PARP3 might
potentially play a role in its inflammatory processes.

The KEGG pathway analysis suggested that
E. coli infection might play an important role in CP.
Karmali and colleagues reported that infection with E. coli
produced postdiarrhoeal haemolytic uraemic syndrome
and that many patients who recovered from it had long-term
sequelae, including CP and cholelithiasis [40,41].



Table 2 Regulatory microRNAs predicted for cluster D

Term                 Genes                  Combined score   P-value
TGCCTTA, miR-124a    CHSY1, ABCC4, CASC4    8.818            0.00828
TCTAGAG, miR-517     SLC39A10, RND3         7.065            0.03707
TAGAACC, miR-182     FAM107B, RBM12         6.138            0.02771


Table 3 Gene Ontology database enrichment analysis of cluster D

Gene Ontology term                                 Genes                   P-value
Response to DNA damage stimulus                    PARP3, TOP2A, RAD51     0.0025
DNA metabolic process                              PRIM1, RFC1, PARP3      0.0041
Chromosome organisation                            CDCA8, H2AFX, RFC1      0.0047
Chromosome condensation                            NCAPH, TOP2A            0.0065
Chromosome segregation                             MIS12, TOP2A            0.0073
Cellular response to DNA damage stimulus           TOPBP1, PARP3, TOP2A    0.0080
mRNA export from nucleus                           AGFG1, RAE1             0.0089
DNA packaging                                      NCAPH, TOP2A            0.0106
Response to ionizing radiation                     TOPBP1, H2AFX           0.0115
RNA export from nucleus                            AGFG1, RAE1             0.0124
Endocytosis                                        SNX4, SFTPD, CD14       0.0150
Regulation of ubiquitin-protein ligase activity    BUB3, CDC23, CCNB1      0.0155
RNA transport                                      AGFG1, RAE1             0.0166
DNA repair                                         TOPBP1, PARP3, TOP2A    0.0170
Nuclear export                                     AGFG1, RAE1             0.0290
DNA replication                                    RFC1, TOP2A, MCM5       0.0324
Organelle organisation                             CDCA8, BUB3, H2AFX      0.0357
Response to stress                                 PARP3, PHLDA3, LSP1     0.0359
Cellular response to stimulus                      PARP3, TOP2A, RAD51     0.0375
Innate immune response                             TUBB2C, SFTPD           0.0441


Furthermore, E. coli might also lead to pancreatic abscess,
which is defined as an acute inflammatory process of the
pancreas [42]. It has been shown that E. coli organisms
can induce polymorphonuclear leucocyte infiltration during
clinical infection [43]. Therefore, we suggest that
E. coli infection might be involved in the occurrence of CP.

This study has some limitations. First is the small sam- 
ple size obtained from the GEO database. Second, valid- 
ation of the results in other data sets or samples is 
lacking. Therefore, further genetic studies with larger 
sample sizes and different kinds of CP samples are 
needed to confirm our observations. 



Table 4 KEGG enrichment analysis of cluster D

KEGG pathway                                   Genes                   P-value
hsa04110 cell cycle                            RBL1, CDC23, MCM5       1.80E-04
hsa05131 enteropathogenic E. coli infection    TUBB2A, TUBB2C, CD14    0.007263
hsa05130 enteropathogenic E. coli infection    TUBB2A, TUBB2C, CD14    0.007263
hsa04115 p53 signalling pathway                CCNB1, CCNB2, APAF1     0.013926
hsa04640 hematopoietic cell lineage            CD38, IL1R1, CD14       0.026877
hsa00600 sphingolipid metabolism               SGMS1, B4GALT6          0.034673

E. coli, Escherichia coli; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Conclusions

miR-124a provides some insight into the mechanism of
CP pathogenesis and is a potential target for the diagnosis
and treatment of CP. miR-124a might participate in
CP occurrence and development by regulating expression
levels of CHSY1 or CASP1. miR-124a might also
regulate the inflammation of CP by changing the
level of proinflammatory mediators through ABCC4. In addition,
DNA damage and E. coli infection might play important
roles in CP pathogenesis.